Building and archiving event web collections: A focused crawler approach

Authors

  • Mohamed M. Farag
  • Edward A. Fox
Abstract

In this paper, we present a new approach for building and archiving web collections about events. Our approach combines the traditional focused crawling technique with event modeling and representation.

Similar resources

Focused Crawl of Web Archives to Build Event Collections

Event collections are frequently built by crawling the live web on the basis of seed URIs nominated by human experts. Focused web crawling is a technique where the crawler is guided by reference content pertaining to the event. Given the dynamic nature of the web and the pace with which topics evolve, the timing of the crawl is a concern for both approaches. We investigate the feasibility of pe...

Intelligent Event Focused Crawling

There is a need for an integrated event focused crawling system to collect Web data about key events. When an event occurs, many users try to locate the most up-to-date information about that event. Yet, there is little systematic collecting and archiving anywhere of information about events. We propose intelligent event focused crawling for automatic event tracking and archiving, as well as effec...

Prioritize the ordering of URL queue in Focused crawler

The enormous growth of the World Wide Web in recent years has made it necessary to perform resource discovery efficiently. For a crawler, it is not a simple task to download domain-specific web pages. An unfocused approach often produces undesired results. Therefore, several new ideas have been proposed; among them, a key technique is focused crawling, which is able to crawl particular topical...

Introduction to the Web Archiving and Digital Libraries 2015 Workshop Issue

Our understanding of the past will, to a large extent, depend on our success with Web archiving. WADL 2015 brought together international leaders from industry, government, and academia, who are tackling this important challenge. This special issue includes summaries of twelve presentations on 24 June 2015. It is hoped that these works will stimulate other digital library (DL) and related inves...

Focused Crawls, Tunneling, and Digital Libraries

Crawling the Web to build collections of documents related to pre-specified topics became an active area of research during the late 1990s, crawler technology having been developed for use by search engines. Now, Web crawling is being seriously considered as an important strategy for building large-scale digital libraries. This paper covers some of the crawl technologies that might be exploite...

Journal title:
  • TCDL Bulletin

Volume 11

Publication date: 2015